Statistical Methods for Quality and Reliability Sectional Committee, MSD 03


FOREWORD 


This Indian Standard (Part 1) (Third Revision) was adopted by the Bureau of Indian Standards, after the draft 
finalized by the Statistical Methods for Quality and Reliability Sectional Committee had been approved by the 
Management and Systems Division Council. 


Numerical data arise in modern industrial operations at a number of stages, for example, in the development of
new methods or new products, in reviewing past experience for setting up standards, in assessment of quality of
incoming materials, in controlling quality during the manufacturing processes, in inspection and testing of end
products, in surveys of performance in actual use, and in the study of consumer preferences. It is important to
ensure that only such data as are really useful and necessary are collected and that the most effective use is made of them.


Once the requisite data are collected, it becomes imperative that they are properly condensed graphically or
pictorially. Accordingly, this Part 1 deals with the tabulation and summarization of statistical data, whereas
Part 2 deals with the diagrammatic representation of data.


This standard is published in two parts. The other part in the series is : 
Part 2 Diagrammatic representation of data 


This standard was originally issued in 1974 with a view to providing the procedures for recording and summarization
of data and computation of quantitative measures of central tendency and dispersion. The first revision was taken
up in 1981 with a view to removing lacunae observed while making class intervals. The second revision was taken
up in 1989 with a view to deleting the clause on 'normal distribution' so as to include the same in IS 9300 (Part 2) :
1979 'Statistical models for industrial applications : Part 2 Continuous models'. Various examples on preparation
of frequency tables, cumulative frequency tables, and calculation of mean and standard deviation were changed by
taking live data from the industry.


The third revision of the standard has been undertaken to, 


a) incorporate minor technical modifications with regard to number of observations for plotting frequencies 
and formation of class intervals, 

b) streamline the standard with others published in the meantime, and 

c) incorporate many editorial corrections. 


In reporting the result of a test or analysis, if the final value, observed or calculated, is to be rounded off, it shall
be done in accordance with IS 2 : 1960 'Rules for rounding off numerical values (revised)'.
Indian Standard


PRESENTATION OF STATISTICAL DATA 
PART 1 TABULATION AND SUMMARIZATION 


(Third Revision ) 


1 SCOPE 


This standard (Part 1) outlines the procedures of 
recording the data, preliminary scrutiny for elimination 
of discrepancies and mistakes in data, summarization 
of data by means of histograms, frequency distribution 
and computation of quantitative measures of central 
tendency and dispersion. 


2 REFERENCES 


The standards given below contain provisions which, 
through reference in this text, constitute provisions of 
this standard. At the time of publication, the editions 
indicated were valid. All standards are subject to revision, 
and parties to agreements based on this standard are 
encouraged to investigate the possibility of applying the 
most recent editions of the standards indicated below: 


IS No.                    Title
7200 (Part 2) : 1975      Presentation of statistical data: Part 2 Diagrammatic representation of data
7920 (Part 1) : 2012      Statistics - Vocabulary and symbols: Part 1 General statistical terms and terms used in probability (third revision)
IS/ISO 16269-4 : 2010     Statistical interpretation of data - Detection and treatment of outliers


3 TERMINOLOGY 


For the purpose of this standard, the definitions given 
in IS 7920 (Part 1) shall apply. 


4 RECORDING OF DATA 


A proforma should be devised to record the items of
information required together with certain ancillary
information needed for identification and verification.
The units of measurement and the degree of accuracy
of figures needed should be clearly specified beforehand.
Instructions for filling in the proforma,
particularly the marginal observations, should be clearly
stated as footnotes at the bottom or reverse of the
proforma. The sequence in which the observations are
made should be preserved. The date and time of
observation, the observer's name or identity number,
and the instruments used should be noted. As an
illustration, the recording of data for moisture content
of biscuits is presented in Annex A.


5 SCRUTINY AND MINIMIZATION OF ERRORS


Discrepancies and errors may arise in the course of
recording of the basic data, summarization, calculation
of statistical measures of location and dispersion, and
presentation of final results. As far as possible, built-in
checks should be provided at each of these stages. The
persons handling the data should be suitably trained.
An extreme observation should be carefully examined
before acceptance or rejection; helpful guidance in
this regard may be obtained from IS/ISO 16269-4.


6 FREQUENCY DISTRIBUTION 


6.1 It is desirable to present the original data in a
suitable form as in Annex A for clearer understanding.
However, in many situations this may not be possible
and another method for summarization of data may
become imperative. Even when the data can be
presented in full, it is rare that the results or
conclusions can be obtained without condensation of
data and further analysis.


6.2 One simple and very useful way of summarizing 
numerical data is by preparing a frequency table. In 
the case of attributes type of data, the summarisation is 
done by noting the number of items in the two 
categories, namely, good and bad, so that the sum of 
the items falling in these two categories is equal to the 
total number of items inspected. 


6.3 In the case of discrete or continuous variables, the
summarization is achieved by forming classes within
which the individual observations differ little from one
another. The full set of data can then be replaced by a
table which shows the number of original observations
falling within each class.


6.4 No rigid rules can be laid down in the formation of 
class intervals, but the following broad guidelines may 
be helpful in most of the practical situations: 


a) For a frequency distribution to show up any definite pattern,
the total number of observations should not be
less than 50 (preferably 100 or more).

b) As far as practicable the intervals should be 
of equal width for better graphical 
representation and easier computation of the 
mean, standard deviation, etc. 
c) From the maximum and the minimum values
found in the data, the range (R) may be
obtained and divided into a suitable number of
class intervals of equal width. The total
number of class intervals (k) may vary from 7
to 15 depending on the nature of the data and
the number of observations, and is generally $\sqrt{n}$,
where n is the number of observations. The
relationship is as follows:

$$\text{Number of class intervals } (k) = \text{Integral part of } \left[\frac{\text{Range } (R)}{\text{Class width } (C)}\right] + 1$$

Once the value of the range of the entire data
is known and also that the total number of class
intervals (k) should be from 7 to 15, the class
width (C) may be obtained from the above
relation.

d) It is not advisable to increase unnecessarily
the number of class intervals so as to
accommodate one or two extreme
observations. Such observations can be
classified conveniently in the first (last) class
interval, which may be written as 'less (greater)
than or equal to a certain value', so that the first
(last) class becomes an open class interval. The class interval
should neither be so large that considerable
error is introduced in further calculations, nor
should it be so small as to result in too many
class intervals having zero or very small
frequencies.

e) The class interval should be defined in such a
way that each observation belongs to one and
only one class. This may be achieved by
specifying class limits to one more decimal
place than that obtained in the set of
observations to be classified. Alternatively,
some arbitrary rule can be fixed in advance
with regard to the allocation of observations
coinciding with the class limits. This rule shall
be spelt out before summarization is
undertaken.

f) It is desirable to choose the class intervals in
such a way that the mid-point of a class
interval is a convenient figure for calculation
and plotting. Where there is a tendency for
certain values (for example, multiples of 5 or
10) to occur much more frequently than the
neighbouring values, the classes should be so
arranged that these values occur at about the
middle of the corresponding intervals.

g) Observations which are to be grouped should
not be rounded off before grouping, since
grouping is a form of rounding and successive
rounding is liable to lead to systematic errors.


6.5 The method of grouping observations may be 
illustrated by summarising the data given in Annex A. 
The first step is to decide what intervals to use. In this 
example, the minimum and the maximum values from 
the data are obtained as 0.7 and 5.3, respectively. 


$$\text{Range } (R) = 5.3 - 0.7 = 4.6$$

$$\text{Number of class intervals } (k) = \text{integral part of } \left(\frac{4.6}{C}\right) + 1$$

If C = 0.5, then k = 10.


Hence, if 0.5 is chosen as the width of each class 
interval, then there will be 10 class intervals. 


As the values of moisture content are given up to the first
place of decimal, the class intervals may be formed
with two places of decimal so that each observation
belongs to one and only one class interval. Also, as
the minimum value in the data is 0.7 and the class width
has been chosen as 0.5, the first class interval may be
formed as 0.55 - 1.05. Similarly, the subsequent class
intervals may be formed with a class width of 0.5 until the
maximum value is also included in a class interval
(see col 1 of Table 1).
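
The arithmetic of 6.4 c) and 6.5 can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the standard: the function names `number_of_classes` and `class_limits` are hypothetical, and `Decimal` is used merely to keep the class limits free of binary floating-point drift.

```python
from decimal import Decimal


def number_of_classes(minimum, maximum, width):
    """k = integral part of (R / C) + 1, following 6.4 c)."""
    data_range = Decimal(str(maximum)) - Decimal(str(minimum))  # range R
    return int(data_range / Decimal(str(width))) + 1


def class_limits(first_lower, width, k):
    """Return k (lower, upper) pairs of equal width, starting at first_lower."""
    lower = Decimal(str(first_lower))
    c = Decimal(str(width))
    return [(lower + i * c, lower + (i + 1) * c) for i in range(k)]


# Values quoted in 6.5: minimum 0.7, maximum 5.3, class width 0.5.
k = number_of_classes(0.7, 5.3, 0.5)
# Limits carry one more decimal place than the observations.
limits = class_limits("0.55", "0.5", k)

print(k)                      # 10
print(limits[0], limits[-1])
```

Starting the first class at 0.55 rather than 0.5 is what guarantees that an observation recorded to one decimal place can never coincide with a class limit.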


For each observation in the set of data, a tally mark is 
put against the relevant class interval. For ease of 
counting, these marks are arranged in blocks of 5, the 
fifth tally mark being drawn across the other four. Table 1 
shows the arrangement. When every observation has 
been allocated to a class, the tally marks in each class 
are counted to get the frequency of that class. 
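
The tallying step can be mimicked with a counter, as in the sketch below. The ten observations are illustrative values chosen for this example (they are not the Annex A data), and the helper name `class_index` is hypothetical.

```python
from collections import Counter
from decimal import Decimal

# Illustrative observations only, recorded to one decimal place.
observations = ["0.8", "1.8", "2.1", "3.7", "2.9", "4.6", "1.0", "2.5", "3.0", "2.6"]

lower0 = Decimal("0.55")   # lower limit of the first class interval
width = Decimal("0.5")     # class width C


def class_index(value):
    """Index of the class interval containing value (0 = first class)."""
    return int((Decimal(value) - lower0) // width)


# One "tally mark" per observation; the count per class is its frequency.
frequency = Counter(class_index(v) for v in observations)

for i in sorted(frequency):
    lo = lower0 + i * width
    print(f"{lo} - {lo + width}: {frequency[i]}")
```

Because the limits have two decimal places and the data only one, every observation falls strictly inside exactly one class, so no tie-breaking rule is needed here.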


6.6 Frequencies may be expressed either as actual or
as relative frequencies. The relative frequencies may
be expressed in decimals or as percentages of the total
(see Table 2).


6.7 For some purposes it is specially useful to know 
the number of observations which are less than (or 
greater than) particular values. A table which gives 
frequencies in this form is called a cumulative frequency 
table. A cumulative frequency table is built up readily 
from a frequency table as follows: 


In 'less than' cumulative tables, the frequency of the
first class interval is written against the upper limit of
the first class interval. The frequency of the second
class interval is added to that of the first class interval
and their sum is recorded against the upper limit of the
second class interval. Each class frequency is added
in turn and the cumulative sum is recorded at each step
until the whole table is completed (see Table 3). To
obtain the cumulative table of observations greater than
the specified values, the procedure is as described
except that a start is made at the bottom of the table
(corresponding to the last class interval) (see Table 4).
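
As a minimal sketch of this procedure, the running sums can be formed with `itertools.accumulate`, using the class limits and frequencies of Table 2. The variable names are chosen for this illustration only.

```python
from itertools import accumulate

# Class limits and actual frequencies as given in Table 2.
lower_limits = [0.55, 1.05, 1.55, 2.05, 2.55, 3.05, 3.55, 4.05, 4.55]
upper_limits = [1.05, 1.55, 2.05, 2.55, 3.05, 3.55, 4.05, 4.55, 5.05]
frequencies = [8, 16, 32, 41, 52, 43, 34, 17, 7]
total = sum(frequencies)

# 'Less than': running sum from the top, recorded against the upper limits.
less_than = list(accumulate(frequencies))

# 'More than': running sum from the bottom, recorded against the lower limits.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for limit, count in zip(upper_limits, less_than):
    print(f"less than {limit:.2f}: {count:3d} ({100 * count / total:.1f} %)")
for limit, count in zip(lower_limits, more_than):
    print(f"more than {limit:.2f}: {count:3d} ({100 * count / total:.1f} %)")
```

The two lists agree with the actual-frequency columns of Tables 3 and 4.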
Table 1 Frequency Table
(Clauses 6.5 and 7.2)

Class Interval   Tally Marks                                                      Frequency
(1)              (2)                                                              (3)
0.55 - 1.05      ||||/ |||                                                          8
1.05 - 1.55      ||||/ ||||/ ||||/ |                                               16
1.55 - 2.05      ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ ||                            32
2.05 - 2.55      ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ |                 41
2.55 - 3.05      ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ ||    52
3.05 - 3.55      ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ |||               43
3.55 - 4.05      ||||/ ||||/ ||||/ ||||/ ||||/ ||||/ ||||                          34
4.05 - 4.55      ||||/ ||||/ ||||/ ||                                              17
4.55 - 5.05      ||||/ ||                                                           7
Total                                                                             250


Table 2 Actual and Relative Frequency Distribution
(Clauses 6.6 and 7.3.1)

Class Interval   Frequency
                 Actual   Relative (Percentage)
(1)              (2)      (3)
0.55 - 1.05        8        3.2
1.05 - 1.55       16        6.4
1.55 - 2.05       32       12.8
2.05 - 2.55       41       16.4
2.55 - 3.05       52       20.8
3.05 - 3.55       43       17.2
3.55 - 4.05       34       13.6
4.05 - 4.55       17        6.8
4.55 - 5.05        7        2.8
Total            250      100.0


Table 3 'Less Than' Cumulative Frequency Distribution (Actual and Percentage)
(Clause 6.7)

Moisture Content    Number Below      Percentage Below
(Percent by Mass)   the Given Limit   the Given Limit
(1)                 (2)               (3)
1.05                  8                 3.2
1.55                 24                 9.6
2.05                 56                22.4
2.55                 97                38.8
3.05                149                59.6
3.55                192                76.8
4.05                226                90.4
4.55                243                97.2
5.05                250               100.0


Table 4 'More Than' Cumulative Frequency Distribution (Actual and Percentage)
(Clause 6.7)

Moisture Content    Number Above      Percentage Above
(Percent by Mass)   the Given Limit   the Given Limit
(1)                 (2)               (3)
0.55                250               100.0
1.05                242                96.8
1.55                226                90.4
2.05                194                77.6
2.55                153                61.2
3.05                101                40.4
3.55                 58                23.2
4.05                 24                 9.6
4.55                  7                 2.8
NOTE - In the case of 'less than' cumulative tables, the upper
limits of class intervals are to be used, whereas for 'more than'
cumulative tables, the lower limits of class intervals are relevant.


6.7.1 Figure 1 depicts the 'less than' and 'more than'
cumulative curves for the relative frequencies. These
two curves intersect at a point, the abscissa of which
corresponds to the median.
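
Since the 'less than' and 'more than' percentages at any abscissa add up to 100, the curves cross where each equals 50 percent. The sketch below locates that abscissa from the Table 3 percentages. It assumes the cumulative curve is drawn as straight segments between tabulated points (linear interpolation), which is an assumption of this example; the function name `value_at_percent` is hypothetical.

```python
# 'Less than' cumulative percentages from Table 3.
upper_limits = [1.05, 1.55, 2.05, 2.55, 3.05, 3.55, 4.05, 4.55, 5.05]
less_than_pct = [3.2, 9.6, 22.4, 38.8, 59.6, 76.8, 90.4, 97.2, 100.0]


def value_at_percent(target, limits, cumulative_pct):
    """Interpolate linearly the abscissa where the 'less than' curve reaches target."""
    points = list(zip(limits, cumulative_pct))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 <= target <= p1:
            return x0 + (x1 - x0) * (target - p0) / (p1 - p0)
    raise ValueError("target lies outside the tabulated range")


median = value_at_percent(50.0, upper_limits, less_than_pct)
print(round(median, 2))   # 2.82
```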


7 HISTOGRAMS AND FREQUENCY CURVES 


7.1 The implication of a series of observations or data
becomes easier to grasp when presented graphically as
compared to a tabular presentation. A simple way of
doing this is to plot the data in the order in which the
individual observations were obtained, using time as
abscissa and the measured variable as ordinate. Such a
graphical presentation, though it may be satisfactory when
the number of observations is small (say, less than 30),
becomes unwieldy if there are a large number of
observations to be presented. Besides, the underlying
pattern of variation in the lot will not be brought out in
such a presentation.


7.2 To present a large number of observations in a graph
of convenient size, condensation of data will be
necessary. This can be achieved by means of a frequency
table (see Table 1). In this presentation, some details
are lost, especially the identity of individual observations
and the sequence of their occurrence, but the
underlying pattern of variation may become clearer.


7.3 A frequency distribution may be simplified further 
and presented as a histogram which consists of series 
of rectangles with the range of the variate as base and 
height proportional to the corresponding frequency 
(see Fig. 2). For drawing a histogram, the class 
intervals of equal width are taken on x-axis with 
corresponding frequencies on y-axis. When class 
intervals are of unequal width, the histograms are 
drawn by making suitable adjustment in the 
dimensions of the rectangles ensuring that the areas 
of the rectangles are proportional to the corresponding 
frequencies. For this purpose, the frequencies may be 
divided by the respective widths of the class intervals 
and the quotients used as the heights of the rectangles. 
Uses of histogram have been explained with practical 
examples in Annex B. 
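
The adjustment for unequal class widths can be illustrated as below. The grouped data are hypothetical values invented for this sketch (they do not come from the standard); the point is only that dividing each frequency by its class width makes the rectangle areas proportional to the frequencies.

```python
# Hypothetical grouped data with unequal class widths: (lower, upper, frequency).
classes = [(0.0, 1.0, 12), (1.0, 2.0, 30), (2.0, 4.0, 40), (4.0, 8.0, 16)]

# Height = frequency / class width, so that area is proportional to frequency.
heights = [(lo, hi, f / (hi - lo)) for lo, hi, f in classes]

for lo, hi, h in heights:
    print(f"{lo:.1f} - {hi:.1f}: height {h:.1f}, area {h * (hi - lo):.0f}")
```

Plotting raw frequencies instead would make the wide 4.0 - 8.0 class look four times more prominent than it should.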


Sometimes, instead of presenting the data in the form 
of a histogram, it is preferred to draw the frequency 
polygon. This is constructed by drawing the line 
segments joining the mid-points of the top of each 
rectangle of the histogram. A typical frequency polygon 
for the data in Table 2 is given in Fig. 3. 


7.4 When the number of observations increases and the
width of class intervals reduces, the number of
rectangles increases and the outline of the histogram
becomes more regular. It is easy to visualize that if this
process is continued, then the frequency polygon drawn
would tend to become a smooth curve. Such a curve is
known as a frequency curve (see Fig. 4). The area
encompassed by the x-axis and the curve between any
two ordinates will be proportional to the frequency of
observations between the two values represented by
the ordinates. A frequency curve can represent either
actual frequencies or relative frequencies.


7.5 For a detailed discussion of the various forms of 
diagrammatic representation of data see IS 7200 (Part 2). 


8 CHARACTERISTIC MEASURES OF
FREQUENCY DISTRIBUTION


8.1 The two important characteristics of frequency 
distribution are: 


a) Central tendency, and 
b) Spread or dispersion. 


8.1.1 The most widely used measures of central
tendency are the mean and the median. In some
situations, the mode is also used as a measure of central tendency.


8.1.2 The two most useful measures of spread are the 
range and the standard deviation. 


8.2 Measures of Central Tendency 
8.2.1 Mean 


The mean is obtained by dividing the sum of the 
observations by the number of observations. 
Symbolically, if $x_1, x_2, \ldots, x_n$ denote the n individual
observations, then the mean, denoted by $\bar{x}$ (read 'x-bar'),
is given by the formula:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

For example, if the values of percentage elongation at
rupture of 8 samples from a batch of copper wire were
obtained as 15.0, 16.0, 16.4, 16.7, 20.9, 19.6, 17.1 and
19.1, the mean elongation is:

$$\bar{x} = \frac{15.0 + \cdots + 19.1}{8} = \frac{140.8}{8} = 17.6 \text{ per cent}$$

8.2.2 Weighted Mean

The weighted mean of the values $x_1, x_2, \ldots, x_n$ having
weights $w_1, w_2, \ldots, w_n$ respectively is given by the
formula:

$$\bar{x} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}$$


For example, in a sampling investigation conducted to 
determine the quality of a lot of gypsum, five gross 
samples were collected and analysed for calcium 
sulphate percentage. The test results obtained were: 


89.09, 86.02, 83.38, 88.69 and 91.76
representing respectively the tonnages of
89.50, 89.00, 79.25, 72.25 and 72.50.

The quality of the lot given by the weighted mean is calculated
as follows:

$$\bar{x} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}$$

$$\bar{x} = \frac{89.50 \times 89.09 + 89.00 \times 86.02 + 79.25 \times 83.38 + 72.25 \times 88.69 + 72.50 \times 91.76}{89.50 + 89.00 + 79.25 + 72.25 + 72.50} = \frac{35\,297.65}{402.50} = 87.70 \text{ percent}$$
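
Both worked examples (the simple mean of 8.2.1 and the weighted mean above) can be checked with a few lines of Python. This is a sketch for verification only; the variable names are chosen for this illustration.

```python
# Percentage elongation of the 8 copper wire samples (8.2.1).
elongation = [15.0, 16.0, 16.4, 16.7, 20.9, 19.6, 17.1, 19.1]
mean = sum(elongation) / len(elongation)

# Calcium sulphate percentages and the tonnages they represent (8.2.2).
values = [89.09, 86.02, 83.38, 88.69, 91.76]
weights = [89.50, 89.00, 79.25, 72.25, 72.50]
weighted_mean = sum(w * x for w, x in zip(weights, values)) / sum(weights)

print(f"{mean:.1f}")            # 17.6
print(f"{weighted_mean:.2f}")   # 87.70
```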


8.2.3 Median 


Median is the value which divides the series of
observations arranged in the order of magnitude so that
the number of observations less than the median is equal
to the number of observations greater than the median.
Thus, if the number of observations is odd, then the
median is the middlemost value when the observations
are arranged in ascending or descending order.
However, if the number of observations is even, then
by convention the median is defined as the mean of the
two middlemost observations.


Thus in a set of 5 observations 11, 12, 14, 16 and 17, 
the value of the median is 14. In another set of ten 
observations given by 8, 9, 9, 10, 11, 12, 12, 13, 16 
and 19, the value of the median is obtained as 
(11+ 12)/2 = 11.5. 
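
The odd/even rule can be written out directly, as in this short sketch. The function name `median` is chosen for this illustration, and the standard-library `statistics.median` is used only as a cross-check.

```python
import statistics


def median(values):
    """Middlemost value for odd n; mean of the two middlemost values for even n."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_set = [11, 12, 14, 16, 17]
even_set = [8, 9, 9, 10, 11, 12, 12, 13, 16, 19]

print(median(odd_set))    # 14
print(median(even_set))   # 11.5
```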


In certain situations, the median is a better measure of
central tendency than the mean since the former is not
very much affected by the extreme values found in the
data. It has the added advantage of being very simple to
calculate, especially when the number of observations is
odd. On the other hand, the mean has the representative
character of the whole data, since all the values enter
directly in its computation, and has extensive applications
in statistical theory. However, in the case of small samples
usually encountered in the control chart techniques
used in process control, both are equally efficient.


8.2.4 Mode 


Mode is that value of the variable which has the highest 
frequency. This measure of central tendency, though 
the easiest of the three (mean, median and mode) to 
find from a frequency distribution, has fewer 
applications in quality control. 


8.3 Measures of Dispersion 
8.3.1 Range 


The range is the difference between the largest and the
smallest observation obtained in a set of data.
Symbolically, $R = x_{max} - x_{min}$, where R denotes the
range, $x_{max}$ is the largest observation and $x_{min}$ is the
smallest observation.
8.3.2 Sample Standard Deviation

This is defined as the square root of the quotient
obtained by dividing the sum of squares of deviations
of the observations from their mean by one less than the
number of observations in the sample.

Symbolically, if $x_1, x_2, \ldots, x_n$ are the n observations with
the mean value $\bar{x}$, the standard deviation s is given
by the formula:

$$s = \left[\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}\right]^{1/2} = \left[\frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}{n-1}\right]^{1/2}$$

For the data given in 8.2.1, the standard deviation is:

$$s = \left[\frac{2\,507.04 - (140.8)^2/8}{8-1}\right]^{1/2} = 2.03$$

8.3.3 Pooled Standard Deviation 


8.3.3.1 The pooled standard deviation for k samples of
sizes $n_i$, i = 1, 2, ..., k and standard deviations $s_i$, i = 1,
2, ..., k respectively is defined as follows:

$$s = \left[\frac{\sum_{i=1}^{k}(n_i - 1)\,s_i^2}{\sum_{i=1}^{k} n_i - k}\right]^{1/2}$$

For example, a lot of sized iron ore was divided into
three sub-lots for determining iron ore content. The
number of unit samples taken from each sub-lot were
5, 6 and 8 with corresponding standard deviation values
(in percent) 1.02, 1.04 and 0.96. The pooled standard
deviation value of the lot is calculated as follows:

$$s = \left[\frac{4 \times (1.02)^2 + 5 \times (1.04)^2 + 7 \times (0.96)^2}{(5 + 6 + 8) - 3}\right]^{1/2}$$

$$s = (1.001\,3)^{1/2}$$

$$s = 1.00$$
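
The pooled value can be checked with the short sketch below, which uses the sub-lot sizes and standard deviations quoted in the example. The variable names are chosen for this illustration.

```python
import math

sizes = [5, 6, 8]            # n_i, unit samples per sub-lot
sds = [1.02, 1.04, 0.96]     # s_i, in percent

numerator = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))
denominator = sum(sizes) - len(sizes)     # sum of n_i minus k
pooled_variance = numerator / denominator
pooled_sd = math.sqrt(pooled_variance)

print(f"{pooled_variance:.4f}")   # 1.0013
print(f"{pooled_sd:.2f}")         # 1.00
```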
In case k samples are drawn from k different
populations having the same standard deviation, the value
of the pooled standard deviation is given by:

where $\bar{x}_i$ and $s_i$ are the mean and sample standard
deviation of the i-th sample and $\bar{x}$ is the pooled mean of all
the k samples.


8.3.4 The range is nearly as 'efficient' as the standard
deviation as an estimate of the lot standard deviation
when the sample size is small, say less than 10. Because
of its simplicity, the range is extensively used in control
chart work. However, as the sample size increases
beyond 10, the efficiency of the standard deviation
becomes much greater than that of the range, which is
easily affected by the extreme values that exist in a set of
data.


8.3.5 The percent coefficient of variation (CV) is
defined as follows:

$$CV = \frac{s}{|\bar{x}|} \times 100$$

where
s = sample standard deviation, and
$|\bar{x}|$ = absolute mean value.

The coefficient of variation of the data given in 8.2.1 is:

$$CV = \frac{2.03}{17.6} \times 100 = 11.53 \text{ percent}$$

9 COMPUTATION OF MEAN AND STANDARD
DEVIATION FROM SAMPLE FREQUENCY
DISTRIBUTION


9.1 The computation of the mean and standard
deviation from a large body of data (say, containing
more than 50 observations) can be carried out with
considerable economy of labour and with a degree of
accuracy adequate for all practical purposes by using a
frequency distribution.


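9.2 If the values $x_1, x_2, \ldots, x_n$ occur with the
frequencies $f_1, f_2, \ldots, f_n$ respectively, the mean value is:

$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}$$

The standard deviation for this type of data is obtained
as:

$$s = \left[\frac{f_1 (x_1 - \bar{x})^2 + \cdots + f_n (x_n - \bar{x})^2}{(f_1 + \cdots + f_n) - 1}\right]^{1/2} = \left[\frac{\sum_{i=1}^{n} f_i x_i^2 - \left(\sum_{i=1}^{n} f_i x_i\right)^2 \Big/ \sum_{i=1}^{n} f_i}{\sum_{i=1}^{n} f_i - 1}\right]^{1/2}$$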


NOTE - When the total number of observations is large (more
than 50), the value of the sample standard deviation will not
differ much even if the sum of squares of deviations from the mean
is divided by the total number of observations instead of one
less than the total number of observations.


9.3 In the case of a frequency distribution for a continuous
variable with n class intervals of equal width, the
calculation of mean and standard deviation may be done
by the same formulae as those given in 9.2 by replacing
$x_1, x_2, \ldots, x_n$ by the mid-values of the class intervals.
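
As a sketch of 9.2 and 9.3, the formulae can be applied to the mid-values and frequencies of the class intervals in Table 2. The results below are computed for this illustration and are not quoted from the standard; the second standard deviation uses the divisor n in place of n - 1 to show how small the difference is for 250 observations.

```python
import math

# Mid-values and frequencies of the class intervals in Table 2.
mid_values = [0.8, 1.3, 1.8, 2.3, 2.8, 3.3, 3.8, 4.3, 4.8]
frequencies = [8, 16, 32, 41, 52, 43, 34, 17, 7]

n_total = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, mid_values))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, mid_values))

mean = sum_fx / n_total
ss = sum_fx2 - sum_fx ** 2 / n_total    # sum of squared deviations from the mean
s = math.sqrt(ss / (n_total - 1))       # divisor n - 1
s_n = math.sqrt(ss / n_total)           # divisor n, for comparison

print(f"{mean:.2f} {s:.3f} {s_n:.3f}")  # 2.81 0.943 0.941
```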


9.4 The computation of mean and standard deviation
can be simplified considerably by suitably transforming
the original variable and then working with the new
transformed variable. It is advisable to subtract from
each of the mid-values of the class intervals that mid-value
(say $x_0$) which has the maximum frequency and
then divide each such difference by the width of the
class interval (c), so that the transformed values
$d_i = (x_i - x_0)/c$ take negative, zero and positive integral values
in a continuous sequence. The mean and standard
deviation of the transformed variable are then calculated by
using the formulae given in 9.2. The mean and standard
deviation for the original data are calculated as:

$$\text{Mean } (\bar{x}) = x_0 + c \times \frac{\sum_{i=1}^{n} f_i d_i}{\sum_{i=1}^{n} f_i}$$

$$\text{Standard deviation } (s) = c \times \left[\frac{\sum_{i=1}^{n} f_i d_i^2 - \left(\sum_{i=1}^{n} f_i d_i\right)^2 \Big/ \sum_{i=1}^{n} f_i}{\sum_{i=1}^{n} f_i - 1}\right]^{1/2}$$

9.4.1 The steps in the computation of mean and
standard deviation by transformation of variables are
illustrated in Table 5. Columns 1 and 3 reproduce the
frequency distribution of Table 1. Column 2 gives the
mid-values of the class intervals, and for all further
calculations it is assumed that these mid-values
occur with the frequencies indicated in column 3.
Column 4 gives the transformed variable obtained by
using a convenient working origin and working unit.
In the present case, the working origin is 2.8, which is
the mid-value of the class interval 2.55 - 3.05 having
the maximum frequency. The working unit is the same
as the class width, which is 0.5. Columns 5 and 6 give
the values of $f_i d_i$ and $f_i d_i^2$. A typical proforma used for
making a frequency table and for computation of mean
and standard deviation from frequency data is given in
Annex C.


Table 5 Computation of Mean and Standard Deviation from Frequency Data
(Clause 9.4.1)

Class Interval   Mid Value   Frequency   Transformed Value   f_i d_i          f_i d_i^2
                 (x_i)       (f_i)       (d_i)
(1)              (2)         (3)         (4)                 (5)              (6)
0.55 - 1.05      0.8           8         -4                  -32              128
1.05 - 1.55      1.3          16         -3                  -48              144
1.55 - 2.05      1.8          32         -2                  -64              128
2.05 - 2.55      2.3          40         -1                  -40               40
                                                             (sum) -184
2.55 - 3.05      2.8          52          0                    0                0
3.05 - 3.55      3.3          43          1                   43               43
3.55 - 4.05      3.8          32          2                   64              128
4.05 - 4.55      4.3          16          3                   48              144
4.55 - 5.05      4.8           7          4                   28              112
5.05 - 5.55      5.3           4          5                   20              100
                                                             (sum) 203
Total                        250                             203 - 184 = 19   967
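
Applying the formulae of 9.4 to the totals of Table 5 can be sketched as follows. The final mean and standard deviation are computed here from the table and are not quoted from the standard; the variable names are chosen for this illustration.

```python
import math

x0 = 2.8    # working origin: mid-value of the class interval 2.55 - 3.05
c = 0.5     # working unit: class width

# Columns 3 and 4 of Table 5.
frequencies = [8, 16, 32, 40, 52, 43, 32, 16, 7, 4]
d = [-4, -3, -2, -1, 0, 1, 2, 3, 4, 5]

n_total = sum(frequencies)                                    # 250
sum_fd = sum(f * di for f, di in zip(frequencies, d))         # 19
sum_fd2 = sum(f * di * di for f, di in zip(frequencies, d))   # 967

mean = x0 + c * sum_fd / n_total
s = c * math.sqrt((sum_fd2 - sum_fd ** 2 / n_total) / (n_total - 1))

print(f"{mean:.3f} {s:.3f}")   # 2.838 0.985
```

Working with small integers d_i keeps the hand arithmetic short; the shift by x_0 affects only the mean, while the scaling by c affects both results.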
ANNEX A
(Clauses 4, 6.1 and 6.5)
MEASUREMENT OF MOISTURE IN BISCUITS
IS No. : 1011 : 2002
Specification : 6.0 (Max)

Serial Moisture Serial Moisture Serial Moisture Serial Moisture Serial Moisture
Number (Percent Number (Percent Number (Percent Number (Percent Number (Percent
by Mass) by Mass) by Mass) by Mass) by Mass)
(1) (2) (1) (2) (1) (2) (1) (2) (1) (2)
1 0.8 51 2.0 101 4.0 151 2.2 201 1.4 
2 1.8 52 1.8 102 3.6 152 2.2 202 3.0 
3 2.1 53 1.8 103 4.6 153 2.8 203 3.3 
4 3.7 54 0.8 104 3.3 154 2.9 204 2.0 
5 2.9 55 23. 105 3.4 155 2.7 205 1.8 
6 4.6 56 43 106 4.8 156 2.0 206 32 
7 4.6 57 0.9 107 12 157 1.8 207 3.1 
8 4.2 58 2.0 108 2.1 158 1.8 208 3.6 
9 12 59 3.4 109 1.9 159 0.8 209 34 
10 2.8 60 3.8 110 1.6 160 2.3 210 3.6 
11 1.2 61 2.6 111 4.3 161 4.8 211 3.4 
12 3.6 62 4.3 112 22 162 3.7 212 3.2 
13 2.7 63 1.9 113 2.8 163 4.1 213 2.6 
14 2.8 64 4.1 114 3.6 164 3.9 214 2.8 
15 1.3 65 2.8 115 3.5 165 44 215 2.6 
2:5 
16 3.6 66 4.0 116 2.9 166 1.7 216 2.7 
17 2.8 67 3.6 117 3.8 167 3.4 217 1.8 
18 1.4 68 4.5 118 4.6 168 1.5 218 27 
19 1.9 69 3.4 119 2.2 169 4.3 219 2.6 
20 2.2 70 2.7 120 2:5 170 220 2.8 
2.8 
21 3.9 71 2.9 121 3.2 171 2.9 221 3.7 
22 3.8 72 227 122 3.1 172 2.0 222 2.9 
23 3.8 73 3.0 123 1.6 173 2.7 223 2.8 
24 2.6 74 3.4 124 32 174 3.7 224 3.7 
25 2.2 75 3.3 125 1.1 175 225 3.2 
3.0 
26 2.2 76 3.6 126 34 176 3.6 226 1.1 
27 3.1 77 1:5 127 1.7 177 0.7 227 3.1 
28 227 78 2:5 128 2.4 178 1.4 228 2.2 
29 2.5 79 2.1 129 3.4 179 1.8 229 2.3 
30 3.4 80 2.4 130 3.0 180 230 3.6 
2.0 
31 2.8 81 3.2 131 3.4 181 1.0 231 3.8 
32 2.9 82 41 132 3.7 182 0.9 232 3.6 
33 2:5 83 4.1 133 3.5 183 1.5 233 3.2 
34 2.4 84 3.4 134 3.4 184 1.0 234 2.0 
35 3.3 85 1:9 135 2.7 185 235 2.8 
12 
36 3.8 86 2.0 136 2.8 186 2.3 236 99 
37 3,2 87 2.8 137 2,9 187 34 237 2.3 
38 2.3 88 2.4 138 3.5 188 3.8 238 3.8 
39 3.3 89 4.0 139 2.6 189 4.3 239 1.8 
40 3.5 90 2.6 140 3.1 190 240 2.6 
2.0 
41 3.6 91 1.7 141 2.4 191 34 241 2.0 
42 3.0 92 1.4 142 2:5 192 1.6 242 2.4 
43 4.3 93 2:2 143 227 193 2.5 243 22. 
44 3.3 94 1.7 144 0.7 194 2:5 244 3:1 
45 4.1 95 1.8 145 2.0 195 245 2.8 
2.3 
46 2.5 96 2.1 146 4.8 196 1.2 246 1.8 
47 3.8 97 4.1 147 3.7 197 2.1 247 2.2 
48 3.8 98 2.2 148 1.6 198 3.0 248 4.5 
49 2.3 99 2.9 149 3.6 199 2.5 249 3.1 
50 4.1 100 4.7 150 2.8 200 250 3.4 
ANNEX B
(Clause 7.3)
USES OF HISTOGRAM


B-1 ADVANTAGES OF HISTOGRAM 


A histogram can be effectively used for process control. 
It has the following advantages: 


a) It gives the graphical representation of the
distribution;

b) Mean and standard deviation of the
distribution can be calculated easily; and

c) Histograms after stratification are useful to
obtain clues for process improvement.


B-2 PRACTICAL APPLICATIONS 


B-2.1 Identification of Problems Faced by the 
Distribution 


B-2.1.1 As explained in 7.4, after a histogram is
prepared, the midpoint of each class interval is
connected by free hand and a smooth curve is drawn
without following the unevenness of the histogram. If
the distribution is normal, this curve usually exhibits a
bell-like form (see Fig. 4).

In contrast, non-normal distribution curves such as
A to E of Fig. 5 are sometimes obtained. Attention shall
be paid to the following points in the identification of
problems on the basis of these histograms:


a) When there is no symmetry, this type of
distribution occurs. For example, in the
distribution of particle size of granules
produced by grinding, the number of large-size
granules decreases rapidly while the small-size
granules exhibit a long-tailed distribution.
This is because the number of large granules
is small in the raw material and fine ones occur
in a natural way even before the grinding
operation.

b) When different types of data are mixed, this
kind of distribution arises. The shape
suggests that stratification may be possible
operator-wise or instrument-wise.

c) This type of distribution often appears when
measurements are taken by an inaccurate
method, or the class width is not an integral
multiple of the measuring unit, or for any other
similar factor or cause depending upon the
situation.

d) This is a truncated distribution which arises
when measurements of only those items which
meet the specifications are recorded or when
the inspection department makes allowances
for substandard articles to meet the required
quantity. It also appears when the data below
a certain level are removed intentionally or
when saturation is reached in a chemical
reaction, because there are no data in the region
beyond the saturation point.

e) This type of distribution is attributable to
differences in quality of raw materials or
process abnormalities.


B-2.2 Comparison with Specification and Target
Values


B-2.2.1 When specification limits and target values are 
known, these shall be entered in the histogram and it 
shall be examined whether the actual data distribution 
is satisfactory. The mean should be as close to median 
as possible. 


B-2.2.2 Comparison of the histogram with specification 
limits is explained by taking A-G in Fig. 6 as examples. 


a) The distribution of data is within the 
specification limits with a considerable margin 
on both sides. One of the measures is to make 
the tolerance tighter to improve the quality of 
the product. 

The control limits of the process may be 
relaxed if reduced cost or increased 
productivity can be expected. 

b) The distribution of data is within the 
specification limits, but the mean (μ) is shifted 
towards the lower specification limit. 
Therefore, a slight decrease in the process 
average may result in off-specification items. 
When this type of distribution appears, steps 
shall be taken to increase the process mean. 


NOTE - The mean (μ) could likewise be shifted towards the upper 
specification limit, requiring the process mean to be 
decreased. 


c) The distribution of data is barely within the 
specification limits. Even a slight change of 
the mean (μ) may produce off-specification 
items. When this type of distribution appears, 
measures shall be taken to reduce the 
variability of the process or the 
appropriateness of the specification limits 
shall be reviewed. 

d) The distribution of data is beyond the 
specification limits. When this type of 
distribution appears, action shall be taken 
immediately. Taking into consideration the 
quality required by the user and the capability 
of the process, the appropriateness of the 
specification limits shall be reviewed first. 
If the quality required by the user and the 
process capability can be reconciled when 
the specification range is widened to 8 times 
the standard deviation, revision of the 
specifications is one of the solutions. If the 
shape of the histogram is considerably flat 
with no distinct peak, or like B and C of Fig. 
5, the cause shall be identified and removed. 

Process improvement may be required if the 
tolerance can be reduced and the distribution 
is normal. 

e) The distribution of the data is barely within 
the specification limits and the mean is shifted 
to the left. This type of distribution may be 
due to three causes: 

1) Products below the lower 
specification limit would not appear 
under certain manufacturing 
conditions. 

2) Products below the lower 
specification limit might have been 
eliminated by 100% inspection. 

3) Off-specification products might 
have been included intentionally 
within the specification limits. This 
may possibly occur when 
measurement is left to the 
manufacturing department and the 
rate of off-specification products is 
controlled strictly. 

Action shall be taken to increase the 
mean to the centre of the 
specification range and reduce the 
variability of the distribution. 

f) The range of the distribution is smaller than 
that of the specification limits, but data are 
distributed beyond the lower specification 
limit because the mid-point is shifted to the 
left. Action shall be taken to increase the mean. 

g) Most of the observations are distributed within 
the specification limits and the target value is 
also at the mid-point of the specification range, 
but some observations lie beyond the lower 
specification limit. The distribution is 
like E of Fig. 5. As this indicates 
non-normality, the cause shall be detected and 
removed. 
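
The quantities examined in the comparisons above (position of the mean relative to the centre of the specification range, width of the specification range relative to the standard deviation, and the proportion of off-specification items) may be computed as in the following minimal sketch. The function name `compare_with_spec`, the data and the limits are hypothetical, and the sample standard deviation (divisor n - 1) is an assumption made here for illustration.

```python
from statistics import mean, stdev


def compare_with_spec(data, lsl, usl):
    """Summarize a data set against lower (lsl) and upper (usl)
    specification limits."""
    m = mean(data)
    s = stdev(data)  # assumption: sample standard deviation (n - 1 divisor)
    off_spec = sum(1 for x in data if x < lsl or x > usl)
    return {
        "mean": m,
        "std_dev": s,
        # negative: mean lies towards the lower limit; positive: towards the upper
        "offset_from_centre": m - (lsl + usl) / 2,
        # width of the specification range expressed in standard deviations
        "spec_range_in_sd": (usl - lsl) / s,
        "fraction_off_spec": off_spec / len(data),
    }


# Hypothetical observations and specification limits, for illustration only.
observations = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
result = compare_with_spec(observations, lsl=9.0, usl=11.0)
for name, value in result.items():
    print(name, round(value, 4))
```

A large value of `spec_range_in_sd` together with a small offset corresponds to case a); an offset towards one limit corresponds to cases b) and f); a non-zero `fraction_off_spec` corresponds to cases d), f) and g).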


B-2.3 Identification of Causes by Stratification 


B-2.3.1 When the range of distribution is large and 
the rate of off-specification items is high, like D of Fig. 6, 
stratification may be used for the identification of 
causes. The cause with the greatest influence shall be 
identified and the histogram shall be prepared after 
stratification with respect to the identified cause. 


When stratification with respect to one cause is 
insufficient, it shall be repeated with others. 


Examples of histograms after stratification are shown 
in Fig. 7 and Fig. 8. 


There are two peaks in the histogram (A) in Fig. 7. Since 
this histogram is drawn on the basis of observations on 
the products manufactured by two machines, 
stratification by the first and second machine is carried 
out to obtain the two histograms (B) and (C) of Fig. 7. 

[Fig. 7: (A) Original distribution; (B) After stratification by machine No. 1; (C) After stratification by machine No. 2] 

Stratification reveals that the range of the distribution 
of the product manufactured by the first machine is 
greater than that by the second machine and its mean is 
lower. 
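
Stratification of this kind amounts to grouping the observations by the suspected cause and summarizing each group separately. The following is a minimal sketch under that reading; the function name `stratify` and the observations are hypothetical and are not taken from Fig. 7.

```python
from collections import defaultdict
from statistics import mean


def stratify(observations):
    """Group (stratum, value) pairs by stratum and return the number of
    observations, mean and range of each stratum."""
    groups = defaultdict(list)
    for stratum, value in observations:
        groups[stratum].append(value)
    return {
        stratum: {
            "n": len(values),
            "mean": mean(values),
            "range": max(values) - min(values),
        }
        for stratum, values in groups.items()
    }


# Hypothetical observations tagged with the machine that produced them.
observations = [
    ("machine 1", 4), ("machine 1", 6), ("machine 1", 8),
    ("machine 1", 10), ("machine 1", 12),
    ("machine 2", 13), ("machine 2", 14), ("machine 2", 15),
    ("machine 2", 16), ("machine 2", 17),
]

summary = stratify(observations)
for stratum, stats in summary.items():
    print(stratum, stats)
```

In these assumed data the first machine shows the larger range and the lower mean, which is the pattern described for Fig. 7. When one stratification is insufficient, the same function can be applied again with a different label (operator, instrument, raw material lot).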


B-2.3.2 Example 


The finishing accuracy of a metal roll 200 mm in 
diameter is specified to be under 10/1 000 mm, but 
off-specification articles amounted to 8%. The first 
histogram in Fig. 8 shows the data of the products 
grouped into classes of width 1/1 000 mm. 


This histogram clearly shows that the mean is 
shifted to the left and the tail extends gradually to 
the right. 


[Fig. 8, first histogram: vertical axis FREQUENCY; horizontal axis × 10⁻³ mm, marked 0, 5, 10] 


The finishing of the roll was carried out by two 
operators, A and B. Accordingly, two independent 
frequency distribution tables were prepared individually 
for A and B and histograms after stratification were 
prepared as shown in the second histogram of Fig. 8. In 
this graph, the solid line indicates the data of A and the 
dotted line that of B. The number of observations for A 
is 41 and that for B is 109. These overlapping 
histograms show that the products manufactured by A 
are within the specification limit, but those by B are 
widely distributed and B is responsible for all the 
off-specification articles. 


Action must be taken to improve the working accuracy 
of operator B. When the working practice of B was 
examined carefully, it was found that B did not attach 
a roll to the grinder properly. His accuracy 
improved to a large extent when he attached the roll in 
the same way as operator A. 
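
The figures quoted in this example allow the off-specification rate of operator B alone to be worked out. The sketch below assumes that the overall rate of 8% refers to the same 150 observations (41 for A and 109 for B) shown in Fig. 8; that assumption is not stated explicitly in the example.

```python
# Figures taken from the example.
n_a = 41             # observations for operator A
n_b = 109            # observations for operator B
overall_rate = 0.08  # overall proportion of off-specification articles

# Assumption: the 8% applies to the combined 150 observations.
n_total = n_a + n_b
n_off_spec = round(overall_rate * n_total)

# All off-specification articles are attributed to B, none to A.
rate_a = 0 / n_a
rate_b = n_off_spec / n_b

print(n_total, n_off_spec)
print(f"operator A: {rate_a:.1%}, operator B: {rate_b:.1%}")
```

Under this assumption, 12 of the 150 articles are off-specification, so the rate for operator B alone is about 11%, noticeably higher than the overall 8% that the unstratified histogram suggests.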


[Fig. 8, second histogram: vertical axis FREQUENCY; horizontal axis × 10⁻³ mm; SPECIFICATION limit marked] 


ANNEX C 
(Clause 9.4.1) 
PROFORMA FOR SUMMARIZATION OF DATA 


Class Interval    Mid-Value    Transformed Value    Frequency (Tally Marks) 


Standard deviation = с х 